{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "TBFXQGKYUc4X"
      },
      "source": [
        "##### Copyright 2018 The TensorFlow Authors."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "cellView": "form",
        "id": "1z4xy2gTUc4a"
      },
      "outputs": [],
      "source": [
        "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "FE7KNzPPVrVV"
      },
      "source": [
        "# Image Classification using tf.keras"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "KwQtSOz0VrVX"
      },
      "source": [
        "<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/examples/blob/master/courses/udacity_intro_to_tensorflow_for_deep_learning/l05c03_exercise_flowers_with_data_augmentation.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
        "  </td>\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://github.com/tensorflow/examples/blob/master/courses/udacity_intro_to_tensorflow_for_deep_learning/l05c03_exercise_flowers_with_data_augmentation.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View source on GitHub</a>\n",
        "  </td>\n",
        "</table>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "gN7G9GFmVrVY"
      },
      "source": [
        "In this Colab you will classify images of flowers. You will build an image classifier using `tf.keras.Sequential` model and load data using `tf.keras.preprocessing.image.ImageDataGenerator`.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "zF9uvbXNVrVY"
      },
      "source": [
        "# Importing Packages"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "VddxeYBEVrVZ"
      },
      "source": [
        "Let's start by importing required packages. **os** package is used to read files and directory structure, **numpy** is used to convert python list to numpy array and to perform required matrix operations and **matplotlib.pyplot** is used to plot the graph and display images in our training and validation data."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "rtPGh2MAVrVa"
      },
      "outputs": [],
      "source": [
        "import os\n",
        "import numpy as np\n",
        "import glob\n",
        "import shutil\n",
        "\n",
        "import tensorflow as tf\n",
        "\n",
        "import matplotlib.pyplot as plt"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Jlchl4x2VrVg"
      },
      "source": [
        "### TODO: Import TensorFlow and Keras Layers\n",
        "\n",
        "In the cell below, import Tensorflow as `tf` and the Keras layers and models you will use to build your CNN. Also, import the `ImageDataGenerator` from Keras so that you can perform image augmentation."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "L1WtoaOHVrVh"
      },
      "outputs": [],
      "source": [
        "#import packages"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "UZZI6lNkVrVm"
      },
      "source": [
        "# Data Loading"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DPHx8-t-VrVo"
      },
      "source": [
        "In order to build our image classifier, we can begin by downloading the flowers dataset. We first need to download the archive version of the dataset and after the download we are storing it to \"/tmp/\" directory."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "_lPjfOmNVrVs"
      },
      "source": [
        "After downloading the dataset, we need to extract its contents."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "OYmOylPlVrVt"
      },
      "outputs": [],
      "source": [
        "_URL = \"https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz\"\n",
        "\n",
        "zip_file = tf.keras.utils.get_file(origin=_URL,\n",
        "                                   fname=\"flower_photos.tgz\",\n",
        "                                   extract=True)\n",
        "\n",
        "base_dir = os.path.join(os.path.dirname(zip_file), 'flower_photos')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "2yge5MKnnjMd"
      },
      "source": [
        "The dataset we downloaded contains images of 5 types of flowers:\n",
        "\n",
        "1. Rose\n",
        "2. Daisy\n",
        "3. Dandelion\n",
        "4. Sunflowers\n",
        "5. Tulips\n",
        "\n",
        "So, let's create the labels for these 5 classes: "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "FiYVs1MEmNHf"
      },
      "outputs": [],
      "source": [
        "classes = ['roses', 'daisy', 'dandelion', 'sunflowers', 'tulips']"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "G1ymuCPS0_eu"
      },
      "source": [
        "Also, the dataset we have downloaded has following directory structure.\n",
        "\n",
        "<pre style=\"font-size: 10.0pt; font-family: Arial; line-height: 2; letter-spacing: 1.0pt;\" >\n",
        "<b>flower_photos</b>\n",
        "|__ <b>daisy</b>\n",
        "|__ <b>dandelion</b>\n",
        "|__ <b>roses</b>\n",
        "|__ <b>sunflowers</b>\n",
        "|__ <b>tulips</b>\n",
        "</pre>\n",
        "\n",
        "As you can see there are no folders containing training and validation data. Therefore, we will have to create our own training and validation set. Let's write some code that will do this.\n",
        "\n",
        "\n",
        "The code below creates a `train` and a `val` folder each containing 5 folders (one for each type of flower). It then moves the images from the original folders to these new folders such that 80% of the images go to the training set and 20% of the images go into the validation set. In the end our directory will have the following structure:\n",
        "\n",
        "\n",
        "<pre style=\"font-size: 10.0pt; font-family: Arial; line-height: 2; letter-spacing: 1.0pt;\" >\n",
        "<b>flower_photos</b>\n",
        "|__ <b>daisy</b>\n",
        "|__ <b>dandelion</b>\n",
        "|__ <b>roses</b>\n",
        "|__ <b>sunflowers</b>\n",
        "|__ <b>tulips</b>\n",
        "|__ <b>train</b>\n",
        "    |______ <b>daisy</b>: [1.jpg, 2.jpg, 3.jpg ....]\n",
        "    |______ <b>dandelion</b>: [1.jpg, 2.jpg, 3.jpg ....]\n",
        "    |______ <b>roses</b>: [1.jpg, 2.jpg, 3.jpg ....]\n",
        "    |______ <b>sunflowers</b>: [1.jpg, 2.jpg, 3.jpg ....]\n",
        "    |______ <b>tulips</b>: [1.jpg, 2.jpg, 3.jpg ....]\n",
        " |__ <b>val</b>\n",
        "    |______ <b>daisy</b>: [507.jpg, 508.jpg, 509.jpg ....]\n",
        "    |______ <b>dandelion</b>: [719.jpg, 720.jpg, 721.jpg ....]\n",
        "    |______ <b>roses</b>: [514.jpg, 515.jpg, 516.jpg ....]\n",
        "    |______ <b>sunflowers</b>: [560.jpg, 561.jpg, 562.jpg .....]\n",
        "    |______ <b>tulips</b>: [640.jpg, 641.jpg, 642.jpg ....]\n",
        "</pre>\n",
        "\n",
        "Since we don't delete the original folders, they will still be in our `flower_photos` directory, but they will be empty. The code below also prints the total number of flower images we have for each type of flower. "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "a-AL030LmcdD"
      },
      "outputs": [],
      "source": [
        "for cl in classes:\n",
        "  img_path = os.path.join(base_dir, cl)\n",
        "  images = glob.glob(img_path + '/*.jpg')\n",
        "  print(\"{}: {} Images\".format(cl, len(images)))\n",
        "  train, val = images[:round(len(images)*0.8)], images[round(len(images)*0.8):]\n",
        "\n",
        "  for t in train:\n",
        "    if not os.path.exists(os.path.join(base_dir, 'train', cl)):\n",
        "      os.makedirs(os.path.join(base_dir, 'train', cl))\n",
        "    shutil.move(t, os.path.join(base_dir, 'train', cl))\n",
        "\n",
        "  for v in val:\n",
        "    if not os.path.exists(os.path.join(base_dir, 'val', cl)):\n",
        "      os.makedirs(os.path.join(base_dir, 'val', cl))\n",
        "    shutil.move(v, os.path.join(base_dir, 'val', cl))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "8Lp-0ejxOtP1"
      },
      "source": [
        "For convenience, let us set up the path for the training and validation sets"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "uh68rmWspp0U"
      },
      "outputs": [],
      "source": [
        "train_dir = os.path.join(base_dir, 'train')\n",
        "val_dir = os.path.join(base_dir, 'val')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "UOoVpxFwVrWy"
      },
      "source": [
        "# Data Augmentation"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Wn_QLciWVrWy"
      },
      "source": [
        "Overfitting generally occurs when we have small number of training examples. One way to fix this problem is to augment our dataset so that it has sufficient number of training examples. Data augmentation takes the approach of generating more training data from existing training samples, by augmenting the samples via a number of random transformations that yield believable-looking images. The goal is that at training time, your model will never see the exact same picture twice. This helps expose the model to more aspects of the data and generalize better.\n",
        "\n",
        "In **tf.keras** we can implement this using the same **ImageDataGenerator** class we used before. We can simply pass different transformations we would want to our dataset as a form of arguments and it will take care of applying it to the dataset during our training process. "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "2uJ1G030VrWz"
      },
      "source": [
        "## Experiment with Various Image Transformations\n",
        "\n",
        "In this section you will get some practice doing some basic image transformations. Before we begin making transformations let's define our `batch_size` and our image size. Remember that the input to our CNN are images of the same size. We therefore have to resize the images in our dataset to the same size.\n",
        "\n",
        "### TODO: Set Batch and Image Size\n",
        "\n",
        "In the cell below, create a `batch_size` of 100 images and set a value to `IMG_SHAPE` such that our training data consists of images with width of 150 pixels and height of 150 pixels."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "QyPkET61yMMX"
      },
      "outputs": [],
      "source": [
        "batch_size =\n",
        "IMG_SHAPE = "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "rlVj6VqaVrW0"
      },
      "source": [
        "### TODO: Apply Random Horizontal Flip\n",
        "\n",
        "In the cell below, use ImageDataGenerator to create a transformation that rescales the images by 255 and then applies a random horizontal flip. Then use the `.flow_from_directory` method to apply the above transformation to the images in our training set. Make sure you indicate the batch size, the path to the directory of the training images, the target size for the images, and to shuffle the images. "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Bi1_vHyBVrW2"
      },
      "outputs": [],
      "source": [
        "image_gen = \n",
        "\n",
        "train_data_gen = "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "zJpRSxJ-VrW7"
      },
      "source": [
        "Let's take 1 sample image from our training examples and repeat it 5 times so that the augmentation can be applied to the same image 5 times over randomly, to see the augmentation in action."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "jqb9OGoVKIOi"
      },
      "outputs": [],
      "source": [
        "# This function will plot images in the form of a grid with 1 row and 5 columns where images are placed in each column.\n",
        "def plotImages(images_arr):\n",
        "    fig, axes = plt.subplots(1, 5, figsize=(20,20))\n",
        "    axes = axes.flatten()\n",
        "    for img, ax in zip( images_arr, axes):\n",
        "        ax.imshow(img)\n",
        "    plt.tight_layout()\n",
        "    plt.show()\n",
        "\n",
        "\n",
        "augmented_images = [train_data_gen[0][0][0] for i in range(5)]\n",
        "plotImages(augmented_images)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qXnwkzFuVrXB"
      },
      "source": [
        "### TODO: Apply Random Rotation\n",
        "\n",
        "In the cell below, use ImageDataGenerator to create a transformation that rescales the images by 255 and then applies a random 45 degree rotation. Then use the `.flow_from_directory` method to apply the above transformation to the images in our training set. Make sure you indicate the batch size, the path to the directory of the training images, the target size for the images, and to shuffle the images. "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "1zip35pDVrXB"
      },
      "outputs": [],
      "source": [
        "image_gen = \n",
        "\n",
        "train_data_gen = "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "m2lNPWFc3Bre"
      },
      "source": [
        "Let's take 1 sample image from our training examples and repeat it 5 times so that the augmentation can be applied to the same image 5 times over randomly, to see the augmentation in action."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "wmBx8NhrVrXK"
      },
      "outputs": [],
      "source": [
        "augmented_images = [train_data_gen[0][0][0] for i in range(5)]\n",
        "plotImages(augmented_images)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "NvqXaD8BVrXN"
      },
      "source": [
        "### TODO: Apply Random Zoom\n",
        "\n",
        "In the cell below, use ImageDataGenerator to create a transformation that rescales the images by 255 and then applies a random zoom of up to 50%. Then use the `.flow_from_directory` method to apply the above transformation to the images in our training set. Make sure you indicate the batch size, the path to the directory of the training images, the target size for the images, and to shuffle the images. "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "tGNKLa_YVrXR"
      },
      "outputs": [],
      "source": [
        "image_gen = \n",
        "\n",
        "train_data_gen = "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "XuoAX0Za4zTk"
      },
      "source": [
        "Let's take 1 sample image from our training examples and repeat it 5 times so that the augmentation can be applied to the same image 5 times over randomly, to see the augmentation in action."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "-KQWw8IZVrXZ"
      },
      "outputs": [],
      "source": [
        "augmented_images = [train_data_gen[0][0][0] for i in range(5)]\n",
        "plotImages(augmented_images)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "OC8fIsalVrXd"
      },
      "source": [
        "### TODO: Put It All Together\n",
        "\n",
        "In the cell below, use ImageDataGenerator to create a transformation that rescales the images by 255 and that applies:\n",
        "\n",
        "- random 45 degree rotation\n",
        "- random zoom of up to 50%\n",
        "- random horizontal flip\n",
        "- width shift of 0.15\n",
        "- height shift of 0.15\n",
        "\n",
        "Then use the `.flow_from_directory` method to apply the above transformation to the images in our training set. Make sure you indicate the batch size, the path to the directory of the training images, the target size for the images, to shuffle the images, and to set the class mode to `sparse`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "gnr2xujaVrXe"
      },
      "outputs": [],
      "source": [
        "image_gen_train = \n",
        "\n",
        "\n",
        "train_data_gen = "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "AW-pV5awVrXl"
      },
      "source": [
        "Let's visualize how a single image would look like 5 different times, when we pass these augmentations randomly to our dataset. "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "z2m68eMhVrXm"
      },
      "outputs": [],
      "source": [
        "augmented_images = [train_data_gen[0][0][0] for i in range(5)]\n",
        "plotImages(augmented_images)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "a99fDBt7VrXr"
      },
      "source": [
        "### TODO: Create a Data Generator for the Validation Set\n",
        "\n",
        "Generally, we only apply data augmentation to our training examples. So, in the cell below, use ImageDataGenerator to create a transformation that only rescales the images by 255. Then use the `.flow_from_directory` method to apply the above transformation to the images in our validation set. Make sure you indicate the batch size, the path to the directory of the validation images, the target size for the images, and to set the class mode to `sparse`. Remember that it is not necessary to shuffle the images in the validation set. "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "54x0aNbKVrXr"
      },
      "outputs": [],
      "source": [
        "image_gen_val = \n",
        "val_data_gen = "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wEgW4i18VrWZ"
      },
      "source": [
        "# TODO: Create the CNN\n",
        "\n",
        "In the cell below, create a convolutional neural network that consists of 3 convolution blocks. Each convolutional block contains a `Conv2D` layer followed by a max pool layer. The first convolutional block should have 16 filters, the second one should have 32 filters, and the third one should have 64 filters. All convolutional filters should be 3 x 3. All max pool layers should have a `pool_size` of `(2, 2)`.\n",
        "\n",
        "After the 3 convolutional blocks you should have a flatten layer followed by a fully connected layer with 512 units. The CNN should output class probabilities based on 5 classes which is done by the **softmax** activation function. All other layers should use a **relu** activation function. You should also add Dropout layers with a probability of 20%, where appropriate. "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Evjf8jZk2zi-"
      },
      "outputs": [],
      "source": [
        "model = "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DADWLqMSJcH3"
      },
      "source": [
        "# TODO: Compile the Model\n",
        "\n",
        "In the cell below, compile your model using the ADAM optimizer, the sparse cross entropy function as a loss function. We would also like to look at training and validation accuracy on each epoch as we train our network, so make sure you also pass the metrics argument."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "08rRJ0sn3Tb1"
      },
      "outputs": [],
      "source": [
        "# Compile the model"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "oub9RtoFVrWk"
      },
      "source": [
        "# TODO: Train the Model\n",
        "\n",
        "In the cell below, train your model using the **fit_generator** function instead of the usual **fit** function. We have to use the `fit_generator` function because we are using the **ImageDataGenerator** class to generate batches of training and validation data for our model. Train the model for 80 epochs and make sure you use the proper parameters in the `fit_generator` function."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "tk5NT1PW3j_P"
      },
      "outputs": [],
      "source": [
        "epochs = \n",
        "\n",
        "history = "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "LZPYT-EmVrWo"
      },
      "source": [
        "# TODO: Plot Training and Validation Graphs.\n",
        "\n",
        "In the cell below, plot the training and validation accuracy/loss graphs."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "8CfngybnFHQR"
      },
      "outputs": [],
      "source": [
        "acc = \n",
        "val_acc = \n",
        "\n",
        "loss = \n",
        "val_loss = \n",
        "\n",
        "epochs_range = \n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "E9FROGm-KGEG"
      },
      "source": [
        "# TODO: Experiment with Different Parameters\n",
        "\n",
        "So far you've created a CNN with 3 convolutional layers and followed by a fully connected layer with 512 units. In the cells below create a new CNN with a different architecture. Feel free to experiment by changing as many parameters as you like. For example, you can add more convolutional layers, or more fully connected layers. You can also experiment with different filter sizes in your convolutional layers, different number of units in your fully connected layers, different dropout rates, etc... You can also experiment by performing image augmentation with more image transformations that we have seen so far. Take a look at the [ImageDataGenerator Documentation](https://keras.io/preprocessing/image/) to see a full list of all the available image transformations. For example, you can add shear transformations, or you can vary the brightness of the images, etc... Experiment as much as you can and compare the accuracy of your various models. Which parameters give you the best result?"
      ]
    }
  ],
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "collapsed_sections": [],
      "name": "l05c03_exercise_flowers_with_data_augmentation.ipynb",
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
